Goto

Collaborating Authors

 conservation property


Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability

arXiv.org Artificial Intelligence

The development of effective explainability tools for Transformers is a crucial pursuit in deep learning research. One of the most promising approaches in this domain is Layer-wise Relevance Propagation (LRP), which propagates relevance scores backward through the network to the input space by redistributing activation values based on predefined rules. However, existing LRP-based methods for Transformer explainability entirely overlook a critical component of the Transformer architecture: its positional encoding (PE), resulting in violation of the conservation property, and the loss of an important and unique type of relevance, which is also associated with structural and positional features. To address this limitation, we reformulate the input space for Transformer explainability as a set of position-token pairs. This allows us to propose specialized theoretically-grounded LRP rules designed to propagate attributions across various positional encoding methods, including Rotary, Learnable, and Absolute PE. Extensive experiments with both fine-tuned classifiers and zero-shot foundation models, such as LLaMA 3, demonstrate that our method significantly outperforms the state-of-the-art in both vision and NLP explainability tasks. Our code is publicly available.


A conservative hybrid physics-informed neural network method for Maxwell-Amp\`{e}re-Nernst-Planck equations

arXiv.org Artificial Intelligence

Maxwell-Amp\`{e}re-Nernst-Planck (MANP) equations were recently proposed to model the dynamics of charged particles. In this study, we enhance a numerical algorithm of this system with deep learning tools. The proposed hybrid algorithm provides an automated means to determine a proper approximation for the dummy variables, which can otherwise only be obtained through massive numerical tests. In addition, the original method is validated for 2-dimensional problems. However, when the spatial dimension is one, the original curl-free relaxation component is inapplicable, and the approximation formula for dummy variables, which works well in a 2-dimensional scenario, fails to provide a reasonable output in the 1-dimensional case. The proposed method can be readily generalised to cases with one spatial dimension. Experiments show numerical stability and good convergence to the steady-state solution obtained from Poisson-Boltzmann type equations in the 1-dimensional case. The experiments conducted in the 2-dimensional case indicate that the proposed method preserves the conservation properties.


Uncertainty Estimation for Safe Human-Robot Collaboration using Conservation Measures

arXiv.org Artificial Intelligence

We present an online and data-driven uncertainty quantification method to enable the development of safe human-robot collaboration applications. Safety and risk assessment of systems are strongly correlated with the accuracy of measurements: Distinctive parameters are often not directly accessible via known models and must therefore be measured. However, measurements generally suffer from uncertainties due to the limited performance of sensors, even unknown environmental disturbances, or humans. In this work, we quantify these measurement uncertainties by making use of conservation measures which are quantitative, system specific properties that are constant over time, space, or other state space dimensions. The key idea of our method lies in the immediate data evaluation of incoming data during run-time referring to conservation equations. In particular, we estimate violations of a-priori known, domain specific conservation properties and consider them as the consequence of measurement uncertainties. We validate our method on a use case in the context of human-robot collaboration, thereby highlighting the importance of our contribution for the successful development of safe robot systems under real-world conditions, e.g., in industrial environments. In addition, we show how obtained uncertainty values can be directly mapped on arbitrary safety limits (e.g, ISO 13849) which allows to monitor the compliance with safety standards during run-time.


Training neural networks under physical constraints using a stochastic augmented Lagrangian approach

arXiv.org Machine Learning

We investigate the physics-constrained training of an encoder-decoder neural network for approximating the Fokker-Planck-Landau collision operator in the 5-dimensional kinetic fusion simulation in XGC. To train this network, we propose a stochastic augmented Lagrangian approach that utilizes pyTorch's native stochastic gradient descent method to solve the inner unconstrained minimization subproblem, paired with a heuristic update for the penalty factor and Lagrange multipliers in the outer augmented Lagrangian loop. Our training results for a single ion species case, with self-collisions and collision against electrons, show that the proposed stochastic augmented Lagrangian approach can achieve higher model prediction accuracy than training with a fixed penalty method for our application problem, with the accuracy high enough for practical applications in kinetic simulations.